Towards Knowledge Discovery from the Vatican Secret Archives. In Codice Ratio -- Episode 1: Machine Transcription of the Manuscripts
In Codice Ratio is a research project to study tools and techniques for
analyzing the contents of historical documents conserved in the Vatican Secret
Archives (VSA). In this paper, we present our efforts to develop a system to
support the transcription of medieval manuscripts. The goal is to provide
paleographers with a tool to reduce their efforts in transcribing large
volumes, such as those stored in the VSA, producing good transcriptions for
significant portions of the manuscripts. We propose an original approach based
on character segmentation. Our solution is able to deal with the dirty
segmentation that inevitably occurs in handwritten documents. We use a
convolutional neural network to recognize characters and language models to
compose word transcriptions. Our approach requires minimal training effort,
making the transcription process more scalable as the production of training
sets requires a few pages and can be easily crowdsourced. We have conducted
experiments on manuscripts from the Vatican Registers, an unreleased corpus
containing the correspondence of the popes. With training data produced by 120
high school students, our system has been able to produce good transcriptions
that can be used by paleographers as a solid basis to speed up the transcription
process at a large scale.
Comment: Donatella Firmani, Marco Maiorino, Paolo Merialdo, and Elena Nieddu.
2018. Towards Knowledge Discovery from the Vatican Secret Archives. In Codice
Ratio - Episode 1: Machine Transcription of the Manuscripts. In Proceedings
of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data
Mining (KDD '18). ACM, New York, NY, USA, 263-27
In Codice Ratio: using VREs in the study of the medieval Vatican registers
In Codice Ratio is a research project that aims to develop novel methods of supporting content analysis and knowledge discovery from large collections of historical documents. The goal is to provide humanities scholars with novel tools to conduct data-driven studies based on large historical sources. We are currently working on the collection of the medieval Vatican registers, dwelling in particular on documents and letters drawn up under the pontificate of Honorius III (1216–1227
In Codice Ratio: Machine Transcription of Medieval Manuscripts
Our project, In Codice Ratio, is an interdisciplinary research
initiative for analyzing content of historical documents conserved in the
Vatican Secret Archives (VSA). As most of such documents are digitized
as images, Machine Transcription is both an enabler of the application
of Knowledge Discovery techniques and a useful tool that helps the
paleographer speed up the transcription process. Our approach
involves a convolutional neural network to recognize characters, statistical
language models to compose and rank word transcriptions, and
crowdsourcing for scalable training data collection. We have conducted
experiments on pages from the medieval manuscript collection known as
the Vatican Registers. Our results show that almost all the considered
words can be transcribed without significant spelling errors
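
As a rough illustration of how per-character recognition and a language model can be combined to compose and rank word transcriptions, the sketch below enumerates candidate words and scores each one. It is not the project's implementation: the segment candidates, their probabilities, the character-bigram table, the floor value, and the names `SEGMENTS`, `BIGRAM`, `score`, and `rank_transcriptions` are all hypothetical, and a small dictionary stands in for a statistical language model.

```python
from itertools import product
from math import log

# Hypothetical classifier output: for each character segment, a list of
# (candidate character, probability) pairs. The values are illustrative only.
SEGMENTS = [
    [("p", 0.7), ("q", 0.3)],
    [("a", 0.6), ("o", 0.4)],
    [("r", 0.55), ("x", 0.45)],
]

# Hypothetical character-bigram probabilities standing in for a language
# model. "^" marks the start of a word; unseen bigrams fall back to FLOOR.
BIGRAM = {("^", "p"): 0.5, ("p", "a"): 0.4, ("a", "x"): 0.3}
FLOOR = 1e-3


def score(word, char_probs):
    """Sum of log classifier probabilities and log bigram probabilities."""
    total = sum(log(p) for p in char_probs)
    prev = "^"
    for ch in word:
        total += log(BIGRAM.get((prev, ch), FLOOR))
        prev = ch
    return total


def rank_transcriptions(segments):
    """Enumerate every combination of candidates and sort by score, best first."""
    scored = []
    for combo in product(*segments):
        word = "".join(ch for ch, _ in combo)
        probs = [p for _, p in combo]
        scored.append((score(word, probs), word))
    return [word for _, word in sorted(scored, reverse=True)]


ranked = rank_transcriptions(SEGMENTS)
print(ranked[0])
```

In this toy setup the classifier alone slightly prefers "r" for the last segment, while the bigram table favors "x" after "a", so the combined score can overturn a single-character decision. Exhaustive enumeration grows exponentially with word length; a practical system would prune candidates, for example with beam search.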
In codice ratio: OCR of handwritten Latin documents using deep convolutional networks
Automatic transcription of historical handwritten documents is a challenging research problem, generally requiring expensive transcriptions from expert paleographers. In Codice Ratio is instead designed to be an end-to-end architecture requiring limited labeling effort, whose aim is the automatic transcription of a portion of the Vatican Secret Archives (one of the largest historical libraries in the world). In this paper, we describe in particular the design of our OCR component for Latin characters. To this end, we first annotated a large corpus of Latin characters with a custom crowdsourcing platform. Leveraging recent progress in deep learning, we designed and trained a deep convolutional network achieving an overall accuracy of 96% over the entire dataset, which is one of the highest results reported in the literature so far. Our training data are publicly available
Multi-residue analysis of eight thioamphetamine designer drugs in human urine by liquid chromatography/tandem mass spectrometry
An analytical procedure for the simultaneous determination in human urine of several thioamphetamine designer drugs (2C-T and ALEPH series) is reported. The quantitative analysis was performed by liquid chromatography/tandem mass spectrometry and has been fully validated. The mass spectrometer was operated in positive-ion, selected reaction monitoring (SRM) mode. In order to minimize interferences with matrix components and to preconcentrate target analytes, solid-phase extraction was introduced in the method as a clean-up step. The entire method was validated for selectivity, linearity, precision and accuracy. The method turned out to be specific, sensitive, and reliable for the analysis of amphetamine derivatives in urine samples. The calibration curves were linear over the concentration range of 1 to 100 ng mL⁻¹ for all drugs with correlation coefficients that exceeded 0.996. The lower limits of detection (LODs) and quantification (LOQs) ranged from 1.2 to 4.9 ng mL⁻¹ and from 3.2 to 9.6 ng mL⁻¹, respectively
Tuning Au(I)···Tl(I) interactions via mixed thia- aza-macrocyclic ligands: effects on the structural and luminescence properties
Reaction of the heterometallic complexes [{Au(C6X5)2}Tl]n (X = Cl, F) with equimolecular amounts of the N,S-mixed-donor crown ethers [12]aneNS3 or [12]aneN2S2 affords the new Au(I)/Tl(I) derivatives [{Au(C6Cl5)2}{Tl(L)}2][Au(C6Cl5)2] [L = [12]aneNS3 (1), [12]aneN2S2 (2)], [{Au(C6F5)2}Tl([12]aneNS3)]2 (3), or [{Au(C6F5)2}Tl([12]aneN2S2)]n (4). These complexes display the same Au/Tl metal ratio, but different structural arrangements. While the chlorinated derivatives 1 and 2·2THF display an ionic structure, the crystal structure of 3 contains neutral tetranuclear Au2Tl2 units, and complex 4 displays a polymeric nature and is the only one that does not show unsupported Au···Tl interactions. The lack of this interaction is responsible for the absence of luminescence in this last case. The optical properties of 1 and 3 in the solid state have been studied experimentally and theoretically, concluding that their luminescence has its origin in the Au···Tl interactions, and this is also influenced by their number and strength. DFT and TD-DFT theoretical calculations on model systems of complexes 1, 3, and 4 have been carried out in order to confirm the origin of their luminescence or its absence, as well as to justify their emission energies in spite of their different solid state structures
In Codice Ratio: Scalable Transcription of Historical Handwritten Documents
Huge amounts of handwritten historical documents are being published by digital libraries worldwide. However, for these raw digital images to be really useful, they need to be annotated with informative content. State-of-the-art Handwritten Text Recognition (HTR) approaches require an impressive training effort by expert paleographers. Our contribution is a scalable, end-to-end transcription workflow, which we call In Codice Ratio, based on fine-grain segmentation of text elements into characters and symbols, with limited training effort. We provide a preliminary evaluation of In Codice Ratio over a corpus of letters by Pope Honorius III, stored in the Vatican Secret Archive
Cardiac β1-adrenoceptor expression in two stress-induced cardiomyopathy-related deaths
Stress-induced cardiomyopathy (SICM) is characterized by transient systolic dysfunction of the apical and/or midventricular myocardial segments in the absence of obstructive coronary artery disease and is unique in that it can manifest itself after acute emotional stress. Excessive amounts of catecholamines released from sympathetic nerve endings as well as from the adrenal medulla under stressful conditions are considered to produce intracellular Ca²⁺ overload and cardiac dysfunction through the β1-adrenoceptor signal transduction pathway. We describe the clinical and pathomorphological findings in two fatal cases of stress-induced cardiomyopathy. Levels of catecholamines and their metabolites in urine samples were also assessed. Morphological patterns seen in SICM result from the complex interplay between sympathetic innervations, β-receptor density and function and catecholamine sensitivity
Influence of the number of metallophilic interactions and structures on the optical properties of heterometallic Au/Ag complexes with mixed-donor macrocyclic ligands
The reactivity of the polymeric gold(I)/silver(I) compound [Au2Ag2(C6F5)4(OEt2)2]n toward the 12-membered mixed-donor macrocyclic ligands 1,7-diaza-4,10-dithiacyclododecane (L1), 1-aza-4,7,10-trithiacyclododecane (L2), N-quinolinylmethyl-1-aza-4,7,10-trithiacyclododecane (L3), and N,N′-bis(quinolinylmethyl)-1,7-diaza-4,10-dithiacyclododecane (L4) was studied. The reactions were carried out using different molar ratios depending on the coordination properties of the ligands, which were modified by changing the donor atoms present in the macrocyclic framework (sulfur or nitrogen) or by linking one or two methylquinoline pendant-arms at the secondary nitrogen atom(s). X-ray diffraction analysis of the new complexes obtained shows a nuclearity that increases on increasing the number of donor atoms in the ligands. The rich structural diversity observed determines different optical responses when the complexes are irradiated with UV-vis light in the solid state and in THF solution. The study of the optical properties reveals that in complexes for which the luminescence is due to metal-metal interactions, higher emission wavelengths are observed as the number of these metallophilic contacts increases, while the luminescence of ionic complexes has its origin in the macrocyclic ligands. TD-DFT calculations were carried out to verify the origin of these interesting structural-optical property relationships